Symbol emergence by combining a reinforcement learning schema model with asymmetric synaptic plasticity
Authors
Abstract
A novel integrative learning architecture, RLSM with an STDP network, is described. This architecture models symbol emergence in an autonomous agent engaged in reinforcement learning tasks. It consists of two constituent learning architectures: a reinforcement learning schema model (RLSM) and a spike timing-dependent plasticity (STDP) network. RLSM is an incremental modular reinforcement learning architecture. It enables an autonomous agent to acquire behavioral concepts incrementally through continuous interactions with its environment and/or caregivers. STDP is a learning rule of neuronal plasticity that is found in the cerebral cortices and the hippocampus. It is a temporally asymmetric learning rule, in contrast with the Hebbian learning rule. We found that STDP enables an autonomous robot to associate auditory input with its acquired behavioral concepts and to select reinforcement learning modules more effectively. Auditory signals that are interpreted based on the acquired behavioral concepts are shown to correspond to “signs” in Peirce’s semiotic triad. This integrative learning architecture is evaluated in the context of modular learning.
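The temporal asymmetry mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the STDP rule used in the paper, so the code below assumes the common pair-based exponential STDP window; the function names and the parameters `a_plus`, `a_minus`, `tau_plus`, and `tau_minus` are hypothetical, as is the timing-symmetric window used here as a stand-in for a Hebbian-style rule.

```python
import math

def stdp_delta_w(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, with dt = t_post - t_pre (ms).

    Assumed textbook pair-based window, not the paper's exact rule:
    pre-before-post (dt > 0) potentiates, post-before-pre (dt < 0) depresses.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

def symmetric_delta_w(dt, a=0.01, tau=20.0):
    """Illustrative timing-symmetric window: the sign ignores spike order."""
    return a * math.exp(-abs(dt) / tau)

if __name__ == "__main__":
    for dt in (-10.0, 10.0):
        print(dt, stdp_delta_w(dt), symmetric_delta_w(dt))
```

Under these assumptions, reversing the order of the pre- and postsynaptic spikes flips the sign of the STDP update, while the symmetric window returns the same value for both orders.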
Similar resources
Models and metaphors in neuroscience: The role of dopamine in reinforcement learning as a case study
Neuroscience makes use of many metaphors in its attempt to explain the relationship between our brain and our behaviour. In this thesis I contrast the most commonly used metaphor, that of computation driven by neuron action potentials, with an alternative view which seeks to understand the brain in terms of an agent learning from the reward signalled by neuromodulators. To explore this reinforcem...
Reinforcement Learning with Modulated Spike Timing-Dependent Synaptic Plasticity Running head: Reinforcement Learning with STDP
Spike timing-dependent synaptic plasticity (STDP) has emerged as the preferred framework linking patterns of pre- and postsynaptic activity to changes in synaptic strength. Although synaptic plasticity is widely believed to be a major component of learning, it is unclear how STDP itself could serve as a mechanism for general purpose learning. On the other hand, algorithms for reinforcement learn...
Mind model seems necessary for the emergence of communication
We consider communication when there is no agreement about symbols and meanings. We treat it within the framework of reinforcement learning. This framework enables us to talk about emotional coupling and to consider the emergence of communication. We apply different reinforcement learning models in our studies and simplify the problem as much as possible. We show that the modelling of the other...
Reinforcement learning with modulated spike timing dependent synaptic plasticity.
Spike timing-dependent synaptic plasticity (STDP) has emerged as the preferred framework linking patterns of pre- and postsynaptic activity to changes in synaptic strength. Although synaptic plasticity is widely believed to be a major component of learning, it is unclear how STDP itself could serve as a mechanism for general purpose learning. On the other hand, algorithms for reinforcement lear...
Spatio-Temporal Credit Assignment in Neuronal Population Learning
In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-t...